AMENDMENTS TO THE CLAIMS

1-27 (canceled) 

28. (original) A method facilitated by a human annotator and performed in a computer 
environment for normalizing a score associated with a document, the method comprising the 
steps of: 

(a) establishing (1) through the computer environment a set of training documents most 
of which are believed not to be relevant to a topic (off-topic) and (2) through the human 
annotator a query relevant to the topic (on-topic); 

(b) assigning, through the computer environment, a training document relevance score to 
each one of the training documents, each training document relevance score representing a 
measure of relevance of its respective document to the topic; 

(c) determining, through the computer environment, statistics relating to all training 
document relevance scores and thereby obtaining determined statistics; 

(d) receiving a testing document; 

(e) calculating, through the computer environment, a score of relevance of the testing 
document to the topic to obtain a testing document relevance score; 

(f) normalizing, through the computer environment and based on the statistics, the testing 
document relevance score to obtain a normalized score wherein: 

normalizing adjusts the testing document relevance score based on the statistics to 
be comparable to other scores from which the statistics were determined, and 

the normalized score is a better predictor of probability of the testing document 
being relevant than the testing document relevance score;

(g) establishing, through the computer environment, a threshold score representing a 
relevance threshold for the topic; 

(h) comparing the normalized score to the threshold score to obtain a comparison; 

and 

(i) designating the testing document as relevant or not relevant to the topic based on the 
comparison. 
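
Steps (b) through (i) of claim 28 can be illustrated with a minimal Python sketch. The claim does not specify how relevance is scored or which statistics are used, so the `relevance_score` function (a simple query-term overlap) and the choice of mean and population standard deviation are assumptions made only for illustration; the names `relevance_score` and `classify` are hypothetical.

```python
from statistics import mean, pstdev


def relevance_score(document, query):
    """Hypothetical scorer (an assumption, not from the claims):
    the fraction of query terms that appear in the document."""
    doc_terms = set(document.lower().split())
    query_terms = set(query.lower().split())
    if not query_terms:
        return 0.0
    return len(doc_terms & query_terms) / len(query_terms)


def classify(testing_document, query, training_documents, threshold):
    """Return (is_relevant, normalized_score) for one testing document."""
    # Step (b): score every (mostly off-topic) training document.
    training_scores = [relevance_score(d, query) for d in training_documents]

    # Step (c): determine statistics over all training scores.
    # Assumption: mean and population standard deviation.
    mu = mean(training_scores)
    sigma = pstdev(training_scores)
    if sigma == 0:
        raise ValueError("training scores have zero spread; cannot normalize")

    # Step (e): score the testing document against the same query.
    raw = relevance_score(testing_document, query)

    # Step (f): normalize the raw score using the training statistics.
    normalized = (raw - mu) / sigma

    # Steps (h) and (i): compare to the threshold and designate.
    return normalized > threshold, normalized
```

Under these assumptions, a call such as `classify(doc, "stock market crash", training_docs, threshold=2.0)` returns both the designation and the normalized score, so the same threshold can be reused across queries whose raw scores are on different scales.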

29. (original) The method of claim 28, wherein the statistics include a mean score of the
training documents not relevant to the topic and a standard deviation of the scores assigned to 
the set of training documents not relevant to the topic. 

30. (original) The method of claim 29, wherein said normalizing step determines the 
normalized score according to the following formula:

normalized_score = (s - μ_off_topic) / σ_off_topic

wherein s represents the score assigned to the testing document, μ_off_topic represents the
mean score of the documents not relevant to the topic, and σ_off_topic represents the standard
deviation of the scores assigned to the set of training documents not relevant to the topic.
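
The formula of claim 30 can be sketched in Python as follows. The function name `normalize_score` is hypothetical, and the use of the population standard deviation (rather than the sample standard deviation) is an assumption, since the claim does not say which is meant.

```python
from statistics import mean, pstdev


def normalize_score(s, off_topic_scores):
    """Normalize a testing document score s against off-topic training scores.

    Computes (s - mu_off_topic) / sigma_off_topic.
    Assumption: sigma is the population standard deviation.
    """
    mu_off_topic = mean(off_topic_scores)
    sigma_off_topic = pstdev(off_topic_scores)
    if sigma_off_topic == 0:
        raise ValueError("off-topic scores have zero spread; cannot normalize")
    return (s - mu_off_topic) / sigma_off_topic
```

The result expresses how many off-topic standard deviations the testing score lies above the off-topic mean.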

31. (original) The method of claim 28, said determining step further comprising: 
determining statistics relating to scores assigned to a set of training documents relevant
to the topic.

32. (original) The method of claim 31, wherein the statistics relating to scores assigned to 
a set of training documents relevant to the topic include a mean score of the documents relevant 
to the topic and a standard deviation of the scores assigned to the set of training documents 
relevant to the topic. 

33. (original) The method of claim 32, wherein said normalizing step comprises: 
normalizing a score assigned to a testing document based on the statistics relating to the scores 
assigned to the set of training documents not relevant to the topic and based on the statistics 
relating to the scores assigned to the set of training documents relevant to the topic. 

34. (original) The method of claim 33, wherein said normalizing step determines the 
normalized score according to the following formula:

normalized_score = f_on_topic * ((s - μ_off_topic) / σ_off_topic)

wherein f_on_topic represents a scale factor based on the statistics relating to the scores
assigned to the set of training documents relevant to the topic, s represents the score assigned to
the testing document, μ_off_topic represents the mean score of the documents not relevant to the
topic, and σ_off_topic represents the standard deviation of the scores assigned to the set of training
documents not relevant to the topic.
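
A possible reading of the formula of claim 34 is sketched below in Python. The claim states only that f_on_topic is a scale factor based on on-topic statistics and does not say how it is computed. The helper `hypothetical_scale_factor` is therefore an assumption chosen for illustration (it scales scores so that the mean on-topic document maps to 1.0), not the claimed definition.

```python
from statistics import mean


def scaled_normalize(s, mu_off_topic, sigma_off_topic, f_on_topic):
    """Compute f_on_topic * ((s - mu_off_topic) / sigma_off_topic)."""
    if sigma_off_topic == 0:
        raise ValueError("sigma_off_topic must be non-zero")
    return f_on_topic * ((s - mu_off_topic) / sigma_off_topic)


def hypothetical_scale_factor(on_topic_scores, mu_off_topic, sigma_off_topic):
    """Assumption, not taken from the claims: pick the scale factor so that
    the mean on-topic training score normalizes to exactly 1.0."""
    z_on = (mean(on_topic_scores) - mu_off_topic) / sigma_off_topic
    if z_on == 0:
        raise ValueError("on-topic mean equals off-topic mean; no scale factor")
    return 1.0 / z_on
```

With a scale factor of this kind, a normalized score near 1.0 would indicate a document that looks like a typical on-topic training document, and a score near 0.0 a typical off-topic one.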

35. (original) The method of claim 28, wherein said designating step comprises: 
designating the testing document as relevant to the topic based on a determination that
the normalized score is greater than the threshold score; and

designating the testing document as not relevant to the topic based on a determination 
that the normalized score is not greater than the threshold score. 

36. (original) The method of claim 28, further comprising: repeating steps (a)-(d) for a 
plurality of topics. 

37. (original) The method of claim 28, further comprising: repeating steps (a)-(d) for a 
plurality of testing documents. 

38. (original) The method of claim 28, wherein the statistics include a first robust 
estimate of a mean score of the set of training documents not relevant to the topic and a second 
robust estimate of a standard deviation of the scores assigned to the set of training documents 
not relevant to the topic. 

39. (original) The method of claim 38 wherein the first robust estimate comprises: 
setting a first robust estimate threshold, based on the determined statistics; 

removing each document of the set of training documents not relevant to the topic which 
is above the first robust estimate threshold, thereby creating a remaining set of training 
documents not relevant to the topic; and 

determining new estimates of the statistics that may be more appropriate for off-topic
documents based solely on the remaining set of training documents. 
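
The trimming procedure of claim 39 can be sketched in Python as follows. The claim says only that the robust estimate threshold is based on the determined statistics. Setting it to the mean plus `k` standard deviations, with `k` defaulting to 2.0, is an assumption made for illustration, as is the function name `robust_off_topic_stats`.

```python
from statistics import mean, pstdev


def robust_off_topic_stats(scores, k=2.0):
    """Re-estimate off-topic mean and standard deviation after trimming.

    Assumption: the robust estimate threshold is mean + k * std of the
    initial statistics; the claims do not fix this choice.
    """
    # Initial (determined) statistics over all training scores.
    mu = mean(scores)
    sigma = pstdev(scores)

    # Set the threshold and remove every document scoring above it,
    # since high scorers may actually be on-topic.
    cutoff = mu + k * sigma
    remaining = [s for s in scores if s <= cutoff]

    # New estimates based solely on the remaining training documents.
    return mean(remaining), pstdev(remaining)
```

Removing the high scorers guards against on-topic documents that slipped into a training set believed to be mostly off-topic, which would otherwise inflate both the mean and the standard deviation.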

40. (original) The method of claim 38 wherein the second robust estimate comprises: 
setting a second robust estimate threshold, based on the determined statistics; removing 
each document of the set of training documents not relevant to the topic which is above the 
second robust estimate threshold, thereby creating a remaining set of training documents not 
relevant to the topic; and 

determining new estimates of the statistics that may be more appropriate for off-topic
documents based solely on the remaining set of training documents. 

41. (original) The method of claim 39 wherein the second robust estimate comprises:
setting a second robust estimate threshold, based on the determined statistics; removing
each document of the set of training documents not relevant to the topic which is above the
second robust estimate threshold, thereby creating a remaining set of training documents not
relevant to the topic; and

determining new estimates of the statistics that may be more appropriate for off-topic
documents based solely on the remaining set of training documents.

42. (currently amended) A computer-readable storage medium containing instructions for 
performing a method in a computer environment for normalizing a score associated with a 
document, the method facilitated by a human annotator comprising: 

(a) establishing (1) through the computer environment [-readable medium] a set of 
training documents most of which are believed not to be relevant to a topic (off-topic) and (2) 
through the human annotator a query relevant to the topic (on-topic); 

(b) assigning, through the computer environment, a training document relevance score to 
each one of the training documents, each training document relevance score representing a 
measure of relevance of its respective document to the topic; 

(c) determining, through the computer environment, statistics relating to all training 
document relevance scores; 

(d) receiving a testing document; 

(e) calculating, through the computer environment, a score of relevance of the testing 
document to the topic to obtain a testing document relevance score; 

(f) normalizing, through the computer environment and based on the statistics, the testing 
document relevance score to obtain a normalized score wherein: 

normalizing adjusts the testing document relevance score based on the statistics to be
comparable to other scores from which the statistics were determined, and

the normalized score is a better predictor of probability of the testing document being
relevant than the testing document relevance score;

(g) establishing, through the computer environment, a threshold score representing a 
relevance threshold for the topic; 

(h) comparing the normalized score to the threshold score to obtain a comparison; 
and 

(i) designating the testing document as relevant or not relevant to the topic based on the 
comparison. 

43. (currently amended) The computer-readable storage medium of claim 42, wherein the 
statistics include a mean score of the training documents not relevant to the topic and a standard 
deviation of the scores assigned to the set of training documents not relevant to the topic. 

44. (currently amended) The computer-readable storage medium of claim 43, wherein 
said normalizing step determines the normalized score according to the following formula:

normalized_score = (s - μ_off_topic) / σ_off_topic

wherein s represents the score assigned to the testing document, μ_off_topic represents the
mean score of the documents not relevant to the topic, and σ_off_topic represents the standard
deviation of the scores assigned to the set of training documents not relevant to the topic.

45. (currently amended) The computer-readable storage medium of claim 42, said 
determining step further comprising: 

determining statistics relating to scores assigned to a set of training documents relevant 
to the topic. 

46. (currently amended) The computer-readable storage medium of claim 45, wherein the 
statistics relating to scores assigned to a set of training documents relevant to the topic include a 
mean score of the documents relevant to the topic and a standard deviation of the scores assigned
to the set of training documents relevant to the topic. 

47. (currently amended) The computer-readable storage medium of claim 46, wherein 
said normalizing step comprises: 

normalizing a score assigned to a testing document based on the statistics relating to the 
scores assigned to the set of training documents not relevant to the topic and based on the 
statistics relating to the scores assigned to the set of training documents relevant to the topic. 

48. (currently amended) The computer-readable storage medium of claim 47, wherein 
said normalizing step determines the normalized score according to the following formula:

normalized_score = f_on_topic * ((s - μ_off_topic) / σ_off_topic)

wherein f_on_topic represents a scale factor based on the statistics relating to the scores
assigned to the set of training documents relevant to the topic, s represents the score assigned to
the testing document, μ_off_topic represents the mean score of the documents not relevant to the
topic, and σ_off_topic represents the standard deviation of the scores assigned to the set of training
documents not relevant to the topic.

49. (currently amended) The computer-readable storage medium of claim 42, wherein 
said designating step comprises: 

designating the testing document as relevant to the topic based on a determination that the 
normalized score is greater than the threshold score; and 

designating the testing document as not relevant to the topic based on a determination 
that the normalized score is not greater than the threshold score. 

50. (currently amended) The computer-readable storage medium of claim 42, further 
comprising: repeating steps (a)-(d) for a plurality of topics. 

51. (currently amended) The computer-readable storage medium of claim 42, further 
comprising: repeating steps (a)-(d) for a plurality of testing documents. 

52. (currently amended) The computer-readable storage medium of claim 42, wherein the
statistics include a robust estimate of a mean score of the training documents not relevant to the 
topic and a robust estimate of a standard deviation of the scores assigned to the set of training 
documents not relevant to the topic. 

53. (original) A method, facilitated by a human annotator and performed in a computer 
environment for normalizing a score associated with a document, the method comprising the 
steps of: 

(a) receiving (1) through the computer environment a set of training documents not 
relevant to a topic (off-topic) and (2) through the human annotator a query including the topic 
(on-topic); 

(b) assigning, through the computer environment, a training document relevance score to 
each one of the training documents, each training document relevance score representing a 
measure of relevance of its respective document to the topic; 

(c) determining, through the computer environment, statistics relating to all training 
document relevance scores; 

(d) receiving a testing document; 

(e) calculating, through the computer environment, a score of relevance of the testing 
document to the topic to obtain a testing document relevance score; and 

(f) normalizing, through the computer environment and based on the statistics, the testing 
document relevance score to obtain a normalized score wherein: 

normalizing adjusts the testing document relevance score based on the statistics to be 
comparable to other scores from which the statistics were determined, and 

the normalized score is a better predictor of probability of the testing document being 
relevant than the testing document relevance score.

54. (original) The method of claim 53, further comprising: 

designating the testing document as relevant or not relevant to the topic based on the 
normalized score. 

Application No. 10/665056 Docket No.: BBNT-P02-283
Amendment dated January 24, 2006
Reply to Office Action of September 26, 2005

55. (original) The method of claim 53, further comprising:

comparing the normalized score to a threshold score; and

designating the testing document as relevant or not relevant to the topic based on
the comparison.

56. (original) A method facilitated by a human annotator and performed by a processor in 
a computer environment for searching for documents relevant to a topic comprising the steps of: 

establishing through the computer environment a set of training documents not relevant 
to the topic (off-topic); 

the human annotator sending a query including the topic (on-topic) to the processor; and 

the human annotator receiving results from the processor indicating a document relevant 
to the topic, wherein the processor: 

assigns, through the computer environment, a training document relevance score to each 
one of the training documents, each training document relevance score representing a measure of 
relevance of its respective document to the topic; 

determines, through the computer environment, statistics relating to all training 
document relevance scores; 

receives a testing document; 

calculates, through the computer environment, a score of relevance of the testing
document to the topic to obtain a testing document relevance score;

normalizes, through the computer environment and based on the statistics, the testing
document relevance score to obtain a normalized score wherein:

normalizing adjusts the testing document relevance score based on the statistics to be
comparable to other scores from which the statistics were determined, and

the normalized score is a better predictor of probability of the testing document being
relevant than the testing document relevance score;

establishes, through the computer environment, a threshold score representing a
relevance threshold for the topic;

compares the normalized score to the threshold score to obtain a comparison; and

designates the testing document as relevant or not relevant to the topic based on the
comparison.